NB-FEB: An Easy-to-Use and Scalable Universal Synchronization Primitive for Parallel Programming

Authors

  • Phuong Hoai Ha
  • Philippas Tsigas
  • Otto J. Anshus
Abstract

This paper addresses the problem of universal synchronization primitives that can support scalable thread synchronization for large-scale many-core architectures. The universal synchronization primitives that have been deployed widely in conventional architectures are the compare-and-swap (CAS) and load-linked/store-conditional (LL/SC) primitives. However, such synchronization primitives are expected to reach their scalability limits in the evolution to many-core architectures with thousands of cores. We introduce a non-blocking full/empty bit primitive, or NB-FEB for short, as a promising synchronization primitive for parallel programming on many-core architectures. We show that the NB-FEB primitive is universal, scalable, feasible and convenient to use. NB-FEB, together with registers, can solve the consensus problem for an arbitrary number of processes (universality). NB-FEB is combinable: memory requests to the same memory location can be combined into a single memory request, which mitigates performance degradation due to synchronization "hot spots" (scalability). Since NB-FEB is a variant of the original full/empty bit that always returns a value instead of waiting for a conditional flag, it is as feasible as the original full/empty bit, which has been implemented in many computer systems (feasibility). The original full/empty bit is well known as a special-purpose primitive for fast producer-consumer synchronization and has been used extensively in that specific application domain. In this paper, we show that NB-FEB can be deployed easily as a general-purpose primitive. Using NB-FEB, we construct a non-blocking software transactional memory system called NBFEB-STM, which can be used to handle concurrent threads conveniently. NBFEB-STM is space efficient: the space complexity of each object updated by N concurrent threads/transactions is Θ(N), which is optimal.
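The abstract does not spell out the primitive's operations, so the following is only a minimal Python sketch of the idea it describes: a memory word carrying a full/empty flag, with a conditional write that always returns immediately instead of waiting on the flag. The class `FullEmptyWord`, the method names `test_flag_and_set` and `load`, and the helpers `decide` and `run_consensus` are hypothetical names chosen for illustration, and a lock stands in for hardware atomicity. The sketch also shows, under those assumptions, how such an operation could let any number of threads agree on one value (the universality claim).

```python
import threading


class FullEmptyWord:
    """Software emulation of one memory word with a full/empty flag.

    Illustrative only: the names are hypothetical, and the lock merely
    stands in for the atomicity a hardware primitive would provide.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None
        self._full = False

    def test_flag_and_set(self, new_value):
        """If the word is empty, store new_value and mark the word full.

        Always returns the (value, flag) pair observed before the call,
        so the caller never waits for the flag to change.
        """
        with self._lock:
            old = (self._value, self._full)
            if not self._full:
                self._value = new_value
                self._full = True
            return old

    def load(self):
        """Return the current (value, flag) pair without modifying it."""
        with self._lock:
            return (self._value, self._full)


def decide(word, proposal):
    """Consensus sketch: the first successful writer wins.

    A caller that found the word empty has installed its own proposal;
    every other caller adopts the value that was already there.
    """
    old_value, was_full = word.test_flag_and_set(proposal)
    return old_value if was_full else proposal


def run_consensus(n):
    """Run n threads that each propose their own index; return all decisions."""
    word = FullEmptyWord()
    results = [None] * n

    def worker(i):
        results[i] = decide(word, i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_consensus(8)` returns eight identical decisions, whichever thread happened to write first. Because the conditional write never blocks, a slow or stalled thread cannot prevent the others from deciding.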

Related articles

NB-FEB: A Universal Scalable Easy-to-Use Synchronization Primitive for Manycore Architectures

This paper addresses the problem of universal synchronization primitives that can support scalable thread synchronization for large-scale manycore architectures. The universal synchronization primitives that have been deployed widely in conventional architectures are the compare-and-swap (CAS) and load-linked/store-conditional (LL/SC) primitives. However, such synchronization primitives are exp...

Declarative Synchronization

Synchronization is one of the hardest tasks in parallel programming. Traditional lock mechanisms are hard to use and error-prone. Software Transactional Memory (STM) provides an easy-to-use and scalable solution to this problem. However, STM does not work well when the critical section accesses a large amount of shared data. This project introduces the design and implementation of a n...

A general but simple technique to handle asynchronous data-parallel control structures

Nowadays, most distributed architectures are easily scalable MIMD (Multiple Instruction streams, Multiple Data streams) parallel computers or networks of workstations. The challenge consists in taking advantage of the power of these architectures. It has been shown that data-parallel languages offer both a programming model that is easy to understand and several execution models which are able to exp...

Resizable, Scalable, Concurrent Hash Tables

We present algorithms for shrinking and expanding a hash table while allowing concurrent, wait-free, linearly scalable lookups. These resize algorithms allow the hash table to maintain constant-time performance as the number of entries grows, and reclaim memory as the number of entries decreases, without delaying or disrupting readers. We implemented our algorithms in the Linux kernel, to test ...

Fast Synchronization on Scalable Cache-Coherent Multiprocessors using Hybrid Primitives

This paper presents a new methodology for implementing fast synchronization on scalable cache-coherent multiprocessors, through the use of hybrid primitives. Hybrid primitives leverage commodity hardware to speed up the execution of the atomic remote Read-Modify-Write (RMW) instructions employed in synchronization algorithms to resolve contending processors, while exploiting the caches to reduc...

Journal:
  • CoRR

Volume: abs/0811.1304

Publication date: 2008